2 research outputs found
Unsupervised Domain Adaptation for Semantic Segmentation using One-shot Image-to-Image Translation via Latent Representation Mixing
Domain adaptation is one of the prominent strategies for handling both domain
shift, that is widely encountered in large-scale land use/land cover map
calculation, and the scarcity of pixel-level ground truth that is crucial for
supervised semantic segmentation. Studies focusing on adversarial domain
adaptation via re-styling source domain samples, commonly through generative
adversarial networks, have reported varying levels of success, yet they suffer
from semantic inconsistencies, visual corruptions, and often require a large
number of target domain samples. In this letter, we propose a new unsupervised
domain adaptation method for the semantic segmentation of very high resolution
images, that i) leads to semantically consistent and noise-free images, ii)
operates with a single target domain sample (i.e. one-shot) and iii) at a
fraction of the number of parameters required from state-of-the-art methods.
More specifically an image-to-image translation paradigm is proposed, based on
an encoder-decoder principle where latent content representations are mixed
across domains, and a perceptual network module and loss function is further
introduced to enforce semantic consistency. Cross-city comparative experiments
have shown that the proposed method outperforms state-of-the-art domain
adaptation methods. Our source code will be available at
\url{https://github.com/Sarmadfismael/LRM_I2I}
Crowd counting via joint SASNet and a guided batch normalization network
Recent studies on crowd counting have achieved promising results by using convolutional neural network (CNN) architectures. However, due to the large variation in scene distribution in real-world crowd datasets, it remains a challenge to achieve high performance using standard CNN methods. Such methods often suffer from performance drops. To address this challenge, this paper proposes a new crowd-counting approach that combines three state-of-the-art methods: Guided-Batch-Normalization, which adapts the model using unseen dataset normalization parameters; the Scale Adaptive Selection Network, which uses a multi-level network to obtain variation feature representations; and Distribution-Matching-Count, which uses a new loss function between prediction and ground truth maps. Combining these methods results in improved performance. Extensive experiments across multiple datasets have demonstrated that the proposed approach outperforms state-of-the-art methods